Revised computational metagenomic processing uncovers hidden and biologically meaningful functional variation in the human microbiome
Abstract
BACKGROUND Recent metagenomic analyses of the human gut microbiome identified striking variability in its taxonomic composition across individuals. Notably, however, these studies often reported marked functional uniformity, with relatively little variation in the microbiome's gene composition or in its overall metabolic capacity.
RESULTS Here, we address this surprising discrepancy between taxonomic and functional variations and set out to track its origins. Specifically, we demonstrate that the functional uniformity observed in microbiome studies can be attributed, at least partly, to common computational metagenomic processing procedures that mask true functional variation across microbiome samples. We identify several such procedures, including commonly used practices for gene abundance normalization, mapping of gene families to functional pathways, and gene family aggregation. We show that accounting for these factors and using revised metagenomic processing procedures uncovers such hidden functional variation, significantly increasing observed variation in the abundance of functional elements across samples. Importantly, we find that this uncovered variation is biologically meaningful and that it is associated with both host identity and health.
CONCLUSIONS Accurate characterization of functional variation in the microbiome is essential for comparative metagenomic analyses in health and disease. Our finding that metagenomic processing procedures mask underlying and biologically meaningful functional variation therefore highlights an important challenge such studies may face. Alternative schemes for metagenomic processing that uncover this hidden functional variation can facilitate improved metagenomic analysis and help pinpoint disease- and host-associated shifts in the microbiome's functional capacity.
Similar resources
A Metagenomic Analysis of Lung Microbiome in Chemically Injured and Healthy Individuals
Background and Aim: The role of the lung microbiome in respiratory complications associated with chemicals such as sulfur mustard or chlorine gas has yet to be determined. The aim of this study was to compare the structure and composition of the lung microbiome in chemically injured and healthy individuals in order to understand the relation between the population of the lung microbiota and res...
MUSiCC: Towards an accurate estimation of average genomic copy-numbers in the human microbiome
Functional metagenomic analyses commonly involve a normalization step, where measured levels of genes or pathways are converted into relative abundances. Here, we demonstrate that this normalization scheme introduces marked biases both across and within human microbiome samples and systematically identify various sample- and gene-specific properties that contribute to these biases. We introduce a...
Human Microbiome
Humans are almost identical in their genetic pattern, but the slight differences in our DNA lead to remarkable phenotypic variation among the human population. There are a variety of microbial communities and their genes (microbiomes) in the human body that play an essential role in human health and disease. The microbes inhabiting our bodies are considerably more variable, with only a third of i...
Xander: employing a novel method for efficient gene-targeted metagenomic assembly.
BACKGROUND Metagenomics can provide important insight into microbial communities. However, assembling metagenomic datasets has proven to be computationally challenging. Current methods often assemble only fragmented partial genes. RESULTS We present a novel method for targeting assembly of specific protein-coding genes. This method combines a de Bruijn graph, as used in standard assembly appr...
A Massively Parallel Sequence Similarity Search for Metagenomic Sequencing Data
Sequence similarity searches have been widely used in the analyses of metagenomic sequencing data. Finding homologous sequences in a reference database enables the estimation of taxonomic and functional characteristics of each query sequence. Because current metagenomic sequencing data consist of a large number of nucleotide sequences, the time required for sequence similarity searches account ...
Journal title:
Volume 5, Issue
Pages -
Publication date: 2017